Audio Visual Cues for Video Indexing and Retrieval
Abstract
This paper studies content-based video retrieval using a combination of audio and visual features. The visual feature is extracted by an adaptive video indexing technique that emphasizes accurate characterization of the spatio-temporal information within video clips. The audio feature is extracted by a statistical time-frequency analysis method that fits Laplacian mixture models to wavelet coefficients. The proposed joint audio-visual retrieval framework is flexible and scalable, and can be applied to various types of video databases.
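To make the audio side of the abstract concrete, the following is a minimal sketch of fitting a Laplacian mixture to wavelet coefficients. It is not the paper's implementation: the Haar wavelet, the two-component zero-mean mixture, the EM initialization, and the names `haar_step`, `fit_laplacian_mixture`, and `audio_feature` are all assumptions chosen for illustration.

```python
import math
import random


def haar_step(signal):
    """One level of the Haar wavelet transform (assumed wavelet choice).

    Returns (approximation, detail) coefficient lists.
    """
    n = len(signal) - len(signal) % 2  # drop a trailing odd sample
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, n, 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, n, 2)]
    return approx, detail


def laplace_pdf(x, b):
    """Zero-mean Laplacian density with scale b."""
    return math.exp(-abs(x) / b) / (2.0 * b)


def fit_laplacian_mixture(coeffs, iters=100):
    """EM for a two-component, zero-mean Laplacian mixture.

    Returns (alpha1, b1, b2), where alpha1 is the weight of the
    component with the smaller scale b1 (so b1 <= b2).
    """
    n = len(coeffs)
    mean_abs = max(sum(abs(c) for c in coeffs) / n, 1e-12)
    # Hypothetical initialization: one narrow and one wide component.
    alpha1, b1, b2 = 0.5, 0.5 * mean_abs, 2.0 * mean_abs
    for _ in range(iters):
        # E-step: responsibility of component 1 for each coefficient.
        resp = []
        for c in coeffs:
            p1 = alpha1 * laplace_pdf(c, b1)
            p2 = (1.0 - alpha1) * laplace_pdf(c, b2)
            total = p1 + p2
            resp.append(p1 / total if total > 0.0 else 0.5)
        # M-step: weights, then scales as weighted mean absolute values.
        s1 = sum(resp)
        s2 = n - s1
        alpha1 = s1 / n
        w1 = sum(r * abs(c) for r, c in zip(resp, coeffs))
        w2 = sum((1.0 - r) * abs(c) for r, c in zip(resp, coeffs))
        b1 = max(w1 / max(s1, 1e-12), 1e-12)
        b2 = max(w2 / max(s2, 1e-12), 1e-12)
    if b1 > b2:  # keep the narrow component first
        alpha1, b1, b2 = 1.0 - alpha1, b2, b1
    return alpha1, b1, b2


def audio_feature(signal, levels=3):
    """Concatenate (alpha1, b1, b2) for the detail band of each level."""
    feature = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        if not detail:
            break
        feature.extend(fit_laplacian_mixture(detail))
    return feature
```

Calling `audio_feature(samples, levels=3)` on a list of audio samples would give a nine-element vector, three mixture parameters per decomposition level. How such a vector is combined with the visual feature for retrieval is not specified in the abstract and is left out here.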
Related resources
Assessing Semantic Relevance by Using Audiovisual Cues
This paper presents two complementary approaches for assessing semantic relevance in video retrieval: (1) adaptive video indexing and (2) elemental concept indexing. Both approaches make extensive use of audiovisual cues. In the former, retrieval is performed by using implicit semantic indices through audio and visual features. Audio features are extracted by statistical time-frequency analysis ...
Multimedia Indexing and Retrieval Techniques: A Review
Retrieval of multimedia has become a requirement for many contemporary information systems. These systems need to provide browsing, querying, navigation, and, sometimes, composition capabilities involving various forms of media. In this survey, we review techniques for text, image, audio and video retrieval. We first look at indexing and retrieval techniques for text, audio, image and video. We...
Joint processing of audio and visual information for multimedia indexing and human-computer interaction
Information fusion in the context of combining multiple streams of data (e.g., audio streams and video streams corresponding to the same perceptual process) is considered in a somewhat generalized setting. Specifically, we consider the problem of combining visual cues with audio signals for the purpose of improved automatic machine recognition of descriptors e.g., speech recognition/transcription,...
Audio-visual Content-based Multimedia Indexing and Retrieval – the Muvis Framework
MUVIS is a collaborative framework that supports indexing, browsing and querying of various multimedia types such as audio, video, and interlaced audio/video in several formats. It allows real-time audio and video capturing and encoding by latest-generation codecs such as MPEG-4, H.263+, MP3 and AAC. MUVIS also supports several audio/video file formats such as AVI, MP4, MP3 and AAC. MUVIS achieves a glob...
Detection of slide transition for topic indexing
This paper presents a novel automatic approach for detecting slide transitions in video sequences of technical lectures. Our approach adopts a foreground vs. background segmentation algorithm to separate a presenter from the projected electronic slides. Once a background template is generated, text captions are detected and analyzed. The segmented caption regions as well as backgrou...